# Disentangled Attention Mechanism

## Deberta Xlarge Mnli

- Author: microsoft · License: MIT · Downloads: 833.58k · Likes: 19
- Tags: Large Language Model, Transformers, English

DeBERTa-XLarge-MNLI is an enhanced BERT model based on the disentangled attention mechanism, fine-tuned on the MNLI task. With 750M parameters, it excels in natural language understanding tasks.
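Since every MNLI entry in this list is used the same way, here is a minimal premise/hypothesis inference sketch. It assumes the Hugging Face hub id `microsoft/deberta-xlarge-mnli` and the `transformers` and `torch` packages; the label names come from the checkpoint's own config, not from anything in this listing.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "microsoft/deberta-xlarge-mnli"  # hub id assumed from the entry above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# Encode the premise/hypothesis pair as a single sequence.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# id2label is read from the model config (MNLI: contradiction/neutral/entailment).
probs = logits.softmax(dim=-1).squeeze()
for idx, p in enumerate(probs.tolist()):
    print(model.config.id2label[idx], round(p, 3))
```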
## Deberta Large

- Author: microsoft · License: MIT · Downloads: 15.07k · Likes: 16
- Tags: Large Language Model, Transformers, English

DeBERTa is an improved BERT model that enhances performance through a disentangled attention mechanism and an enhanced masked decoder, surpassing BERT and RoBERTa on multiple natural language understanding tasks.
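The "disentangled attention mechanism" these entries keep referencing scores each token pair as a sum of content-to-content, content-to-position, and position-to-content terms over relative positions. Below is an illustrative sketch of that score computation, following the formulation in the DeBERTa paper; all tensor and function names are invented for the example and do not mirror any library internals.

```python
import math
import torch

d, L, k = 64, 8, 4                 # head dim, sequence length, max relative distance
H = torch.randn(L, d)              # content states
P = torch.randn(2 * k, d)          # relative position embeddings

Wq_c, Wk_c = torch.randn(d, d), torch.randn(d, d)   # content projections
Wq_r, Wk_r = torch.randn(d, d), torch.randn(d, d)   # position projections

def rel_bucket(i, j):
    # Clamp the relative distance i - j into [0, 2k), as in the paper.
    return max(0, min(2 * k - 1, i - j + k))

Qc, Kc = H @ Wq_c, H @ Wk_c        # content queries/keys
Qr, Kr = P @ Wq_r, P @ Wk_r        # relative-position queries/keys

A = torch.zeros(L, L)
for i in range(L):
    for j in range(L):
        c2c = Qc[i] @ Kc[j]                      # content-to-content
        c2p = Qc[i] @ Kr[rel_bucket(i, j)]       # content-to-position
        p2c = Kc[j] @ Qr[rel_bucket(j, i)]       # position-to-content
        A[i, j] = (c2c + c2p + p2c) / math.sqrt(3 * d)

attn = A.softmax(dim=-1)           # standard softmax over key positions
print(attn.shape)                  # torch.Size([8, 8])
```

The 1/sqrt(3d) scaling reflects that three dot-product terms, rather than one, are summed per position.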
## Deberta V2 Xxlarge Mnli

- Author: microsoft · License: MIT · Downloads: 4,077 · Likes: 8
- Tags: Large Language Model, Transformers, English

DeBERTa V2 XXLarge is an enhanced BERT variant based on the disentangled attention mechanism, fine-tuned here on the MNLI task. With 1.5 billion parameters, it surpasses RoBERTa and XLNet on natural language understanding tasks.
## Deberta Base

- Author: microsoft · License: MIT · Downloads: 298.78k · Likes: 78
- Tags: Large Language Model, English

DeBERTa is an improved BERT model based on the disentangled attention mechanism and an enhanced masked decoder, excelling in multiple natural language understanding tasks.
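For base checkpoints like this one, which are not fine-tuned on a downstream task, a typical use is plain feature extraction. A minimal sketch, assuming the hub id `microsoft/deberta-base` and the `transformers` and `torch` packages:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("DeBERTa disentangles content and position.", return_tensors="pt")
with torch.no_grad():
    # Contextual embeddings for each token: (batch, seq_len, hidden_size).
    hidden = model(**inputs).last_hidden_state
print(hidden.shape)
```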
## Deberta V2 Xxlarge

- Author: microsoft · License: MIT · Downloads: 9,179 · Likes: 33
- Tags: Large Language Model, Transformers, English

DeBERTa V2 XXLarge is an improved BERT model based on disentangled attention and an enhanced masked decoder. With 1.5 billion parameters, it surpasses BERT and RoBERTa on multiple natural language understanding tasks.
## Deberta V2 Xlarge Mnli

- Author: microsoft · License: MIT · Downloads: 51.59k · Likes: 9
- Tags: Large Language Model, Transformers, English

DeBERTa V2 XLarge MNLI is Microsoft's DeBERTa V2 XLarge fine-tuned on the MNLI task. The architecture improves on BERT through a disentangled attention mechanism and an enhanced masked decoder, outperforming BERT and RoBERTa on multiple NLU tasks.
## Deberta V2 Xlarge

- Author: microsoft · License: MIT · Downloads: 116.71k · Likes: 22
- Tags: Large Language Model, Transformers, English

DeBERTa V2 XLarge is an enhanced natural language understanding model developed by Microsoft, which improves the BERT architecture through a disentangled attention mechanism and an enhanced masked decoder, achieving state-of-the-art performance on multiple NLP tasks.
## Deberta Xlarge

- Author: microsoft · License: MIT · Downloads: 312 · Likes: 2
- Tags: Large Language Model, Transformers, English

DeBERTa improves on BERT and RoBERTa with a disentangled attention mechanism and an enhanced masked decoder, demonstrating superior performance on most natural language understanding tasks.
## Deberta Large Mnli

- Author: microsoft · License: MIT · Downloads: 1.4M · Likes: 18
- Tags: Large Language Model, Transformers, English

DeBERTa-Large-MNLI is an improved BERT model based on the disentangled attention mechanism and an enhanced masked decoder, fine-tuned on the MNLI task and excelling in multiple natural language understanding tasks.
## Deberta V3 Small

- Author: microsoft · License: MIT · Downloads: 189.23k · Likes: 51
- Tags: Large Language Model, Transformers, English

DeBERTa-v3 is an improved natural language understanding model developed by Microsoft, optimized through ELECTRA-style pretraining and gradient-disentangled embedding sharing, achieving strong performance while keeping the parameter count relatively small.
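One practical note for the DeBERTa-v3 family: its tokenizer is SentencePiece-based, so the `sentencepiece` package must be installed alongside `transformers`. A minimal loading sketch, assuming the hub id `microsoft/deberta-v3-small`:

```python
from transformers import AutoModel, AutoTokenizer

# Loading the v3 tokenizer fails without the sentencepiece package installed.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-small")
model = AutoModel.from_pretrained("microsoft/deberta-v3-small")

# Inspect the SentencePiece subword segmentation.
print(tokenizer.tokenize("gradient-disentangled embedding sharing"))
```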
## Deberta V3 Xsmall

- Author: microsoft · License: MIT · Downloads: 87.40k · Likes: 43
- Tags: Large Language Model, Transformers, English

DeBERTaV3 is Microsoft's improved version of DeBERTa, which gains efficiency from ELECTRA-style pretraining with gradient-disentangled embedding sharing and demonstrates excellent performance on natural language understanding tasks.
## Deberta V2 Xlarge

- Author: kamalkraj · License: MIT · Downloads: 302 · Likes: 0
- Tags: Large Language Model, Transformers, English

DeBERTa is a decoding-enhanced BERT model with disentangled attention; its improved attention mechanism and enhanced masked decoder let it surpass BERT and RoBERTa on multiple natural language understanding tasks.
## Deberta Base

- Author: kamalkraj · License: MIT · Downloads: 287 · Likes: 0
- Tags: Large Language Model, Transformers, English

DeBERTa is a decoding-enhanced BERT model with disentangled attention that improves on BERT and RoBERTa and excels in natural language understanding tasks.
## Deberta Base Mnli

- Author: microsoft · License: MIT · Downloads: 96.92k · Likes: 6
- Tags: Large Language Model, English

A decoding-enhanced BERT model with disentangled attention, fine-tuned on the MNLI task.
## Deberta V3 Large Mnli

- Author: khalidalt · Downloads: 150 · Likes: 5
- Tags: Text Classification, Transformers, English

A DeBERTa-v3-large model trained on the MultiNLI dataset for textual entailment (NLI) classification.
## V3large 2epoch

- Author: NDugar · License: MIT · Downloads: 31 · Likes: 0
- Tags: Large Language Model, Transformers, English

DeBERTa is an improved BERT model based on the disentangled attention mechanism. With 160GB of training data and 1.5 billion parameters, it surpasses the performance of BERT and RoBERTa on multiple natural language understanding tasks.
## 1epochv3

- Author: NDugar · License: MIT · Downloads: 28 · Likes: 0
- Tags: Large Language Model, Transformers, English

DeBERTa is an enhanced BERT model based on the disentangled attention mechanism, surpassing BERT and RoBERTa on multiple natural language understanding tasks.
## Debertav3 Mnli Snli Anli

- Author: NDugar · Downloads: 26 · Likes: 3
- Tags: Large Language Model, Transformers, English

DeBERTa is a decoding-enhanced BERT model with disentangled attention, which improves on BERT and RoBERTa and performs better on most natural language understanding tasks.
## V3large 1epoch

- Author: NDugar · License: MIT · Downloads: 32 · Likes: 0
- Tags: Large Language Model, Transformers, English

DeBERTa is a decoding-enhanced BERT model based on the disentangled attention mechanism, excelling in natural language understanding tasks.
## Deberta Large Mnli Zero Cls

- Author: Narsil · License: MIT · Downloads: 51.27k · Likes: 14
- Tags: Large Language Model, Transformers, English

DeBERTa is a decoding-enhanced BERT model based on the disentangled attention mechanism, surpassing BERT and RoBERTa on multiple natural language understanding tasks by improving the attention mechanism and masked decoder.
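MNLI fine-tunes like this one are commonly wrapped in the `transformers` zero-shot classification pipeline, which turns each candidate label into an entailment hypothesis. A sketch, assuming the hub id `Narsil/deberta-large-mnli-zero-cls` (inferred from the entry above); `microsoft/deberta-large-mnli` should work the same way:

```python
from transformers import pipeline

# Zero-shot classification reuses the MNLI entailment head under the hood.
classifier = pipeline("zero-shot-classification",
                      model="Narsil/deberta-large-mnli-zero-cls")

result = classifier(
    "The new GPU doubles training throughput.",
    candidate_labels=["hardware", "cooking", "politics"],
)
# Labels are returned sorted by score, highest first.
print(result["labels"][0], round(result["scores"][0], 3))
```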
## V2xl Again Mnli

- Author: NDugar · License: MIT · Downloads: 30 · Likes: 0
- Tags: Large Language Model, Transformers, English

DeBERTa is a decoding-enhanced BERT model based on the disentangled attention mechanism. By improving the attention mechanism and the masked decoder, it surpasses BERT and RoBERTa on multiple natural language understanding tasks.